(54) A cache system for concurrent processes

(57) A method of operating a cache memory is described in a system in which a processor is capable of
executing a plurality of processes, each process including a sequence of instructions. In the method a cache
memory is divided into cache partitions, each cache partition having a plurality of addressable storage locations
for holding items in the cache memory. A partition indicator is allocated to each process identifying which, if
any, of said cache partitions is to be used for holding items for use in the execution of that process. When the
processor requests an item from main memory during execution of said current process and that item is not
held in the cache memory, the item is fetched from main memory and loaded into one of the plurality of
addressable storage locations in the identified cache partition.
Description

The present invention relates to a cache system for operating between a processor and a main memory of a 
computer, and is particularly concerned with a processor capable of executing a plurality of concurrent processes. 
As is well known in the art, cache memories are used in computer systems to decrease the access latency to
certain data and code and to decrease the memory bandwidth used for that data and code. A cache memory can delay,
aggregate and reorder memory accesses.

A cache memory operates between a processor and a main memory of a computer. Data and/or instructions which
are required by the process running on the processor can be held in the cache while that process runs. An access to
the cache is normally much quicker than an access to main memory. If the processor does not locate a required data
item or instruction in the cache memory, it directly accesses main memory to retrieve it, and the requested data item
or instruction is loaded into the cache. There are various known systems for using and refilling cache memories.

In order to rely on a cache in a real time system, the behaviour of the cache needs to be predictable. That is, there
needs to be a reasonable degree of certainty that particular data items or instructions which are expected to be found
in the cache will in fact be found there. Most existing refill mechanisms will normally always attempt to place in the
cache a requested data item or instruction. In order to do this, they must delete other data items or instructions from
the cache. This can result in items being deleted which were expected to be there for later use. This is particularly the
case for a multi-tasking processor, or for a processor which has to handle interrupt processes or other unpredictable
processes.

It is an object of the present invention to provide a cache system which provides greater predictability of caching
behaviour for a processor executing a plurality of concurrent processes.

In this context, concurrent processes are considered to be processes which are executed by a common processor,
but not necessarily simultaneously. That is, a first process may start to run and may be interrupted for some reason.
The processor will then start to execute a second process but is ready to interrupt that when the first process is ready
to run again or in response to some other prompt. This is managed by a process handler. It is important that data and/
or instructions associated with the first process are not evicted from the cache while the second process is running.
Conversely, it is useful to allow the second process to have access to the cache while it is running. Consider for example
the situation illustrated in Figure 8 where two processes, process A and process B, are running concurrently on one
CPU. Process A is scheduled first and while it has the CPU it may completely fill the data cache with its own data,
evicting any data which has been placed in the data cache for process B. When control then swaps to process B, it
may then reverse the state of the data cache, throwing out all of the data of process A and bringing in its own. This
ping-ponging of data cache state is common between concurrent processes and is often detrimental to performance.

According to one aspect of the present invention there is provided a method of operating a cache memory arranged
between a processor and a main memory of a computer, the processor being capable of executing a plurality of
processes wherein each process includes a sequence of instructions, the method comprising:

dividing the cache memory into cache partitions, each cache partition having a plurality of addressable storage
locations for holding items in the cache memory;

allocating to each process a partition indicator identifying which, if any, of said cache partitions is to be used for
holding items for use in the execution of that process; and

when the processor requests an item from main memory during execution of said current process and that item
is not held in the cache memory, fetching the item from main memory and loading it into one of the plurality of
addressable storage locations in the identified cache partition.

By allocating a partition indicator to each process, processes running concurrently on the processor are prevented
from evicting each other's data and/or instructions from the cache memory. That is, the cache partition allocated for
example to a first process running on the processor cannot be overwritten by a subsequent, second process. Instead,
the second process will have its own cache partition allocated to it. It is of course preferable that the allocation of
partition indicators to processes can be altered so that once the first process has completely finished, the cache partition
which was allocated to it can then be allocated to another process.

Depending on the needs of the process, it is possible to allocate more than one cache partition to a process or to 
deny a process access to the cache at all. 

In the described embodiment, the partition indicator for a current process which is being executed is held in a
process status store which also holds status information about the process. This is referred to herein as the thread
status word register. When a new process is to be executed by the processor, a new thread status word is loaded into
the store with a new partition indicator allocated to that process.

The partition indicator can be included in a group identifier for the process, the group identifier identifying an address
space for the process. In a virtual addressing system, the processor issues addresses comprising a virtual page number
and a line-in page number and a translation look-aside buffer is provided for translating the virtual page number to a
real page number for accessing the main memory. The translation look-aside buffer can thus also receive the group
identifier and derive therefrom the partition indicator for the current process depending on the virtual address space
which has been allocated to the process.
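The derivation of a partition indicator from the group identifier and the virtual page number can be sketched as follows. This is an illustrative software model only, not the described hardware: the names `Tlb`, `TlbEntry`, `map_page` and `translate`, the 8-bit line-in page width and the sample page numbers are all assumptions made for the example.

```python
from typing import NamedTuple


class TlbEntry(NamedTuple):
    real_page: int
    partition_indicator: int


class Tlb:
    """Toy translation look-aside buffer keyed by (group, virtual page)."""

    def __init__(self):
        self.entries = {}

    def map_page(self, group, virtual_page, real_page, partition_indicator):
        # Hypothetical set-up call: one entry per (group, virtual page) pair.
        self.entries[(group, virtual_page)] = TlbEntry(real_page, partition_indicator)

    def translate(self, group, virtual_page, line_in_page, line_bits=8):
        # The real page replaces the virtual page; the line-in page number
        # passes through unchanged. The partition indicator comes out alongside.
        entry = self.entries[(group, virtual_page)]
        real_address = (entry.real_page << line_bits) | line_in_page
        return real_address, entry.partition_indicator


tlb = Tlb()
tlb.map_page(group=1, virtual_page=0x10, real_page=0x2A, partition_indicator=0b1100)
tlb.map_page(group=2, virtual_page=0x10, real_page=0x3B, partition_indicator=0b0011)
print(tlb.translate(group=1, virtual_page=0x10, line_in_page=0x05))
```

In this model the same virtual page number yields a different real page and a different partition indicator depending on the group, which is the property the paragraph above relies on.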

The line-in page number of the items addressed can be used to identify the address storage location within the
cache partition into which the item is to be located. That is, each cache partition is direct-mapped. It will be apparent
that it is not necessary to use all of the end bits of the item's address as the line-in page number, but merely a set of
appropriate bits. These will normally be near the least significant end of the address.

One or more cache partitions may be allocated to a process.

The system can include a cache access circuit which accesses items from the cache memory according to the
address in main memory of said items and regardless of the cache partition in which the item is held in the cache
memory. That is, the partition indicator is only used on refill and not on look-up. Thus, a cached item can be retrieved
even if, subsequent to its caching, that partition is now allocated to a different process.
According to another aspect of the invention there is provided a computer system comprising:

a processor for executing a plurality of processes wherein each process includes a sequence of instructions, the
processor having a process status store which holds a partition indicator for a current process which is currently
being executed;

a main memory; 

a cache memory having a set of cache partitions, each cache partition comprising a plurality of addressable storage 
locations for holding items fetched from said main memory for use by the processor in execution of its processes; 

a cache refill mechanism arranged to fetch an item from the main memory and to load said item into the cache
memory at one of said addressable storage locations, wherein the cache refill mechanism selects said one of said
addressable storage locations for loading said items in dependence on the partition indicator held in the process
status store in association with the current process.

Each process can include one or more sequences of instructions held at addresses in the main memory within a
common page number. Cache partitions can be allocated to processes by associating each cache partition with page
numbers of a particular process in the main memory. This is described in our earlier GB Application No. 9701960.8.

As an alternative, a partition indicator can be held in the thread status word register and supplied directly to the
cache refill mechanism.

The number of addressable storage locations in each cache partition can be alterable. Also, the association of
cache partitions to page numbers can be alterable while a process using these page numbers is being run by the
processor.

The following described embodiment illustrates a cache system which gives protection of the contents of the cache 
against unexpected eviction by reading from or writing to cache-lines from other processes whose data are placed in 
other partitions. It also provides a system in which the contents of the cache may be predicted.

For a better understanding of the present invention and to show how the same may be carried into effect, reference 
will now be made by way of example to the accompanying drawings in which: 

Figure 1 is a block diagram of a computer incorporating a cache system; 
Figure 2 is a sketch illustrating a four way set associative cache; 
Figure 3 is a block diagram of the CPU of Figure 1 ; 
Figure 4 is an example of an entry in a translation look-aside buffer; 
Figure 5 is a block diagram of the refill engine; 

Figure 6 is a diagram illustrating the operation of a multi-tasking processor;
Figure 7 is a diagram illustrating the alteration in caching behaviour for the system of Figure 6;
Figure 8 illustrates a non-partitioned cache; and
Figures 9 and 10 illustrate useful applications of the invention. 

Figure 1 is a block diagram of a computer incorporating a cache system. The computer comprises a CPU 2 which
is connected to an address bus 4 for accessing items from a main memory 6 and to a data bus 8 for returning items
to the CPU 2. The data bus 8 is used for the return of items from the main memory 6, whether or not they constitute
actual data or instructions for execution by the CPU. The system described herein is suitable for use on both
instruction and data caches. As is known, there may be separate data and instruction caches, or the data and
instruction cache may be combined. In the computer described herein,
the addressing scheme is a so-called virtual addressing scheme. The address is split into a line in page address 4a
and a virtual page address 4b. The virtual page address 4b is supplied to a translation look-aside buffer (TLB) 10. The
line in page address 4a is supplied to a look-up circuit 12. The translation look-aside buffer 10 supplies a real page
address 14 converted from the virtual page address 4b to the look-up circuit 12. The look-up circuit 12 is connected
via address and data buses 16, 18 to a cache access circuit 20. Again, the data bus 18 can be for data items or
instructions from the main memory 6. The cache access circuit 20 is connected to a cache memory 22 via an address
bus 24, a data bus 26 and a control bus 28 which transfers replacement information for the cache memory 22. A refill
engine 30 is connected to the cache access circuit 20 via a refill bus 32 which transfers replacement information, data
items (or instructions) and addresses between the refill engine and the cache access circuit. The refill engine 30 is
itself connected to the main memory 6.

The refill engine 30 receives from the translation look-aside buffer 10 a full real address 34, comprising the real
page address and line in page address of an item in the main memory 6. The refill engine 30 also receives a partition
indicator from the translation look-aside buffer 10 on a four bit bus 36. The function of the partition indicator will be
described hereinafter.

Finally, the refill engine 30 receives a miss signal on line 38 which is generated in the look-up circuit 12 in a manner 
which will be described more clearly hereinafter. 

The cache memory 22 described herein is a direct mapped cache. That is, it has a plurality of addressable storage
locations, each location constituting one row of the cache. Each row contains an item from main memory and the
address in main memory of that item. Each row is addressable by a row address which is constituted by a number of
bits representing the least significant bits of the address in main memory of the data items stored at that row. For
example, for a cache memory having eight rows, each row address would be three bits long to uniquely identify those
rows. For example, the second row in the cache has a row address 001 and thus could hold any data items from main
memory having an address in the main memory which ends in the bits 001. Clearly, in the main memory, there would
be many such addresses and thus potentially many data items to be held at that row in the cache memory. Of course,
the cache memory can hold only one data item at that row at any one time.
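The row addressing just described can be illustrated with a short sketch. It assumes the eight-row cache of the example; the function name `row_address` and the sample addresses are hypothetical.

```python
NUM_ROWS = 8                            # eight rows, as in the example above
ROW_BITS = NUM_ROWS.bit_length() - 1    # three bits are enough to identify a row


def row_address(main_memory_address: int) -> int:
    """Return the cache row selected by the least significant address bits."""
    return main_memory_address & (NUM_ROWS - 1)


# Every address ending in the bits 001 competes for the second row.
for addr in (0b0001, 0b1001, 0b110001):
    print(f"{addr:#010b} -> row {row_address(addr):03b}")
```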

Operation of the computer system illustrated in Figure 1 will now be described but as though the partition indicator
was not present. The CPU 2 requests an item from main memory 6 using the address in main memory and transmits
that address on address bus 4. The virtual page number is supplied to the translation look-aside buffer 10 which
translates it into a real page number 14 according to a predetermined virtual to real page translation algorithm. The
real page number 14 is supplied to the look-up circuit 12 together with the line in page number 4a of the original address
transmitted by the CPU 2. The line in page address is used by the cache access circuit 20 to address the cache memory
22. The line in page address includes a set of least significant bits (not necessarily including the end bits) of the
address in main memory which are equivalent to the row address in the cache memory 22. The contents of the cache memory
22 at the row address identified by the line in page address, being a data item (or instruction) and the address in main
memory of the data item (or instruction), are supplied to the look-up circuit 12. There, the real page number of the
address which has been retrieved from the cache memory is compared with the real page number which has been
supplied from the translation look-aside buffer 10. If these addresses match, the look-up circuit indicates a hit which
causes the data item which was held at that row of the cache memory to be returned to the CPU along data bus 8. If
however the real page number of the address which was held at the addressed row in the cache memory 22 does not
match the real page number supplied from the translation look-aside buffer 10, then a miss signal is generated on line
38 to the refill engine 30. It is the task of the refill engine 30 to retrieve the correct item from the main memory 6, using
the real address which is supplied from the translation look-aside buffer 10 on bus 34. The data item, once fetched
from main memory 6, is supplied to the cache access circuit 20 via the refill bus 32 and is loaded into the cache memory
22 together with the address in main memory. The data item itself is also returned to the CPU along data bus 8 so that
the CPU can continue to execute. In a direct mapped cache memory as outlined above, it will be apparent that the
data item and its address recalled from the main memory 6 will be loaded into the storage location from which the data
item was originally accessed for checking. That is, it will be over-written into the only location which can accept it,
having a row address matching the set of least significant bits in the line in page address in main memory. Of course,
the page number of the data item originally stored in the cache memory and the data item which is now to be loaded
into it are different. This "one to one mapping" limits the usefulness of the cache.
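The look-up and refill sequence described above can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the circuit of Figure 1: main memory is a plain dictionary, the line in page address is taken to be three bits wide and identical to the row address, and the class name `DirectMappedCache` is hypothetical.

```python
LINE_BITS = 3   # assumed: the line in page address is 3 bits and equals the row address


class DirectMappedCache:
    def __init__(self, main_memory):
        self.main_memory = main_memory           # dict: real address -> item
        self.rows = [None] * (1 << LINE_BITS)    # each row holds (real page, item)

    def read(self, real_address):
        """Return (item, hit) for the requested real address."""
        page = real_address >> LINE_BITS
        row = real_address & ((1 << LINE_BITS) - 1)
        entry = self.rows[row]
        if entry is not None and entry[0] == page:
            return entry[1], True                # page numbers match: hit
        item = self.main_memory[real_address]    # miss: fetch from main memory
        self.rows[row] = (page, item)            # overwrite the only eligible row
        return item, False


memory = {addr: f"item{addr}" for addr in range(32)}
cache = DirectMappedCache(memory)
for addr in (1, 1, 9, 1):
    print(addr, cache.read(addr))
```

Addresses 1 and 9 share row 001, so the access to 9 evicts the item loaded for 1 and the final access to 1 misses again, which is the "one to one mapping" limitation noted above.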

To provide a cache system with greater flexibility, an n-way set associative cache memory has been developed.
An example of a 4-way set associative cache is illustrated in Figure 2. The cache memory is divided into four banks
B1, B2, B3, B4. The banks can be commonly addressed row-wise by a common row address, as illustrated schematically
for one row in Figure 2. However, that row contains four cache entries, one for each bank. The cache entry for bank
B1 is output on bus 26a, the cache entry for bank B2 is output on bus 26b, and so on for banks B3 and B4. Thus, this
allows four cache entries for one row address (or line in page address). Each time a row is addressed, four cache
entries are output and the real page numbers of their addresses are compared with the real page number supplied
from the translation look-aside buffer 10 to determine which entry is the correct one. If there is a cache miss upon an
attempted access to the cache, the refill engine 30 retrieves the requested item from the main memory 6 and loads it
into the correct row in one of the banks, in accordance with a refill algorithm which is based on, for example, how long
a particular item has been held in the cache, or other program parameters of the system. Such replacement algorithms
are known and are not described further herein.
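A 4-way set associative look-up of the kind described can be sketched as below. The replacement rule used here (fill an empty bank first, otherwise replace the entry held longest) is only one possible choice, picked for illustration because the text mentions age as an example criterion; the class name, the 3-bit row address and the dictionary model of main memory are assumptions.

```python
from itertools import count

LINE_BITS = 3    # assumed row-address width
NUM_BANKS = 4    # banks B1..B4 of Figure 2, indexed 0..3 here


class SetAssociativeCache:
    def __init__(self, main_memory):
        self.main_memory = main_memory
        # banks[b][row] is None or (real page, item, load time)
        self.banks = [[None] * (1 << LINE_BITS) for _ in range(NUM_BANKS)]
        self.clock = count()

    def read(self, real_address):
        page = real_address >> LINE_BITS
        row = real_address & ((1 << LINE_BITS) - 1)
        # One row address selects four entries; compare each page number.
        for bank in self.banks:
            entry = bank[row]
            if entry is not None and entry[0] == page:
                return entry[1], True
        item = self.main_memory[real_address]

        def age_key(b):
            entry = self.banks[b][row]
            return -1 if entry is None else entry[2]   # empty banks sort first

        victim = min(range(NUM_BANKS), key=age_key)
        self.banks[victim][row] = (page, item, next(self.clock))
        return item, False


memory = {addr: addr * 10 for addr in range(64)}
cache = SetAssociativeCache(memory)
print([cache.read(a)[1] for a in (1, 9, 17, 25)])   # four misses fill the row
print([cache.read(a)[1] for a in (1, 9, 17, 25)])   # all four now hit
```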

Nevertheless, the n-way set associative cache (where n is the number of banks and is equal to four in Figure 2),
while being an improvement on a single direct mapped system, is still inflexible and, more importantly, does not allow
the behaviour of the cache to be properly predictable.

The system described herein provides a cache partitioning mechanism which allows the optimisation of the
computer's use of the cache memory by a more flexible cache refill system.

Figure 3 is a schematic block diagram of a CPU 2 for use in the computer of Figure 1. The CPU 2 comprises an
execution circuit 15 which is connected to a fetch circuit 17 which is responsible for addressing memory via the memory
bus 4 and retrieving data and instructions via the data bus 8. A set of general purpose registers 7 is connected to the
execution circuit 15 for holding data and instructions for use in executing a process. In addition, a set of special registers
is provided, denoted by reference numerals 9, 11 and 13. There may be any number of special purpose registers
and by way of example register 11 holds the instruction pointer which identifies the line of code which is currently being
executed. Special register 9 holds a thread status word which defines the status of a process being executed
by the CPU 2. The execution circuit 15 is capable of executing one process or sequence of instructions at any one
time. However, it is equally capable of interrupting that process and starting to execute another process before the first
process has finished executing. There are many reasons why a process currently being executed by the execution
circuit 15 may be interrupted. One is that a higher priority interrupt process is to be executed immediately. Another is
that the process being executed is currently awaiting data with a long latency, so that it is more efficient for the execution
circuit to commence executing a subsequent process while the first process is waiting for that data. When the data
has been received, the first process can be rescheduled for execution. The execution of concurrent processes is known
per se and is managed by a process handler 19.

Each process is executed under a so-called "thread" of control. A thread has the following state: 

an instruction pointer which indicates where in the process the thread has advanced to, 

a jump pointer which indicates where the process will branch to next, 

a set of general purpose registers 7 which contain immediately accessible values,

the mapping of virtual addresses to physical addresses, 

the contents of memory accessible through the virtual addresses, 

control registers accessible by the thread, and 

optionally other values such as floating point rounding mode, whether the thread has kernel privileges etc. 

Some of the above state is specified by a small set of values which is referred to herein as the thread status word and
which is held in the thread status word register 9. The thread status word specifically holds information about:

whether the thread is in kernel mode or not, 

which virtual address space the thread can access, 

the floating point flags, trap enables and modes, 

debug information, and 



trap optimisation information. 



The format of the thread status word is defined in Table I. 
Name          Bits     Size   Description
TSW.WATCH     35       1      Watchpoints enabled.
TSW.ENABLE    36       1      Trap enable.
              37-47    11     Reserved.
TSW.GROUP     48-55    8      Group number.
              56-63    8      Reserved.



As can be seen from Table I, the thread status word includes an 8 bit group number. This is used as described in
the following to generate the partition indicator for allocating cache partitions.
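Reading and writing the 8 bit group number in a 64 bit thread status word can be sketched as follows. The sketch assumes that bit 0 is the least significant bit of the word, which the text does not state; the helper names `tsw_group` and `tsw_with_group` are hypothetical.

```python
GROUP_SHIFT = 48     # TSW.GROUP occupies bits 48-55 (bit 0 assumed least significant)
GROUP_MASK = 0xFF    # 8 bit field


def tsw_group(tsw: int) -> int:
    """Extract the group number from a thread status word."""
    return (tsw >> GROUP_SHIFT) & GROUP_MASK


def tsw_with_group(tsw: int, group: int) -> int:
    """Return a copy of the thread status word with the group number replaced."""
    cleared = tsw & ~(GROUP_MASK << GROUP_SHIFT)
    return cleared | ((group & GROUP_MASK) << GROUP_SHIFT)


tsw = tsw_with_group(0, 0xAB)
print(hex(tsw), hex(tsw_group(tsw)))
```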

In the translation look-aside buffer 10 in the system described herein, each TLB entry has, associated with the
virtual page number, a real page number and an information sequence. The information sequence contains various
information about the address in memory in a manner which is known and which will not be described further herein.
However, according to the presently described system the information sequence additionally contains a partition code
which generates a partition indicator PI dependent on the group number and the virtual page number. This is illustrated
diagrammatically in Figure 4, where VP represents the virtual page number, RP represents the real page number, GN
represents the group number and INFO represents the information sequence. In the described embodiment PI is four
bits long.

Thus, bits 0 to 3 of the information sequence INFO constitute the partition indicator. The partition indicator gives
information regarding the partition into which the data item may be placed when it is first loaded into the cache memory
22. For the cache structure illustrated in Figure 2, each partition can constitute one bank of the cache. In the partition
indicator, each bit refers to one of the banks. The value of 1 in bit j of the partition indicator means that the data may
not be placed in partition j. The value of 0 in bit j means that the data may be placed in partition j. Data may be placed
in more than one partition by having a 0 in more than one bit of the partition indicator. A partition indicator which is all
zeros allows the data to be placed in any partition of the cache. A partition indicator which is all ones does not allow
any data items to be loaded into the cache memory. This could be used for example for "freezing" the contents of the
cache, for example for diagnostic purposes.

In the example given in Figure 4, the partition indicator indicates that replacement of data items may not use banks 
B1 or B3 but may use banks B2 or B4. 
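The bit convention of the partition indicator can be made concrete with a small decoder. It assumes, purely for illustration, that bit 0 corresponds to bank B1 and bit 3 to bank B4 (the text does not fix the ordering); the function name `allowed_banks` is hypothetical.

```python
NUM_BANKS = 4


def allowed_banks(partition_indicator: int) -> list:
    """List the banks an item may be loaded into: a 0 in bit j permits bank j."""
    return [f"B{j + 1}" for j in range(NUM_BANKS)
            if not (partition_indicator >> j) & 1]


print(allowed_banks(0b0000))   # all zeros: any bank may be used
print(allowed_banks(0b1111))   # all ones: nothing may be loaded
print(allowed_banks(0b0101))   # B1 and B3 barred, as in the Figure 4 example
```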

It is quite possible to allocate more than one bank to a process. In that case, if the line in page address has more
bits than the row address for the cache, the partitions would behave as a k-way set associative cache, where k partitions
are allocated to a page. Thus, in the described example the process of Figure 4 can use banks B2 and B4. However,
it may not use banks B1 and B3.

The partition information is not used on cache look-up, but only upon cache replacement or refill. Thus, the cache
access can locate data items held anywhere in the cache memory, whereas a replacement will only replace data into
the allowed partitions for that process.
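The distinction between look-up and refill can be shown in a small model. This is a sketch under assumptions rather than the described refill engine: banks are indexed 0 to 3 with bit j of the indicator governing bank j, the victim among the allowed banks is chosen by simple rotation, and the names `PartitionedCache` and `next_victim` are hypothetical.

```python
LINE_BITS = 3
NUM_BANKS = 4


class PartitionedCache:
    def __init__(self, main_memory):
        self.main_memory = main_memory
        self.banks = [[None] * (1 << LINE_BITS) for _ in range(NUM_BANKS)]
        self.next_victim = 0    # rotating choice among the allowed banks

    def read(self, real_address, partition_indicator):
        page = real_address >> LINE_BITS
        row = real_address & ((1 << LINE_BITS) - 1)
        # Look-up ignores the partition indicator: every bank is searched.
        for bank in self.banks:
            entry = bank[row]
            if entry is not None and entry[0] == page:
                return entry[1], True
        item = self.main_memory[real_address]
        # Refill is restricted to banks whose indicator bit is 0.
        allowed = [j for j in range(NUM_BANKS)
                   if not (partition_indicator >> j) & 1]
        if allowed:             # an all-ones indicator leaves the cache untouched
            victim = allowed[self.next_victim % len(allowed)]
            self.next_victim += 1
            self.banks[victim][row] = (page, item)
        return item, False


memory = {addr: f"item{addr}" for addr in range(64)}
cache = PartitionedCache(memory)
PI_FIRST, PI_SECOND = 0b1100, 0b0011    # disjoint partitions for two processes
cache.read(1, PI_FIRST)
for addr in (9, 17, 25, 33):            # the second process churns its own banks
    cache.read(addr, PI_SECOND)
print(cache.read(1, PI_FIRST))          # the first process's item is still cached
```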

Figure 5 illustrates in more detail the content of the refill engine 30. The refill bus 32 is shown in Figure 5 as three
separate buses, a data bus 32a, an address bus 32b and a bus 32c carrying replacement information. The address
and data buses 32b and 32a are supplied to a memory access circuit 50 which accesses the main memory via the
memory bus 54. The replacement information is fed to a decision circuit 52 which also receives the real address 34,
the partition indicator PI on bus 36 and the miss signal 38. The decision circuit 52 determines the proper partition of
the cache into which data accessed from the main memory is to be located.

The cache partitioning mechanism described herein is particularly useful for a multi-tasking CPU. A multi-tasking
processor is capable of executing concurrent processes, that is running more than one process "simultaneously". In
practice, the processor executes one process at a time and, when that process is halted for some reason, perhaps in need
of data or a stimulus to proceed, the processor immediately begins executing another process. Thus, the processor is
kept in operation even when individual processes may be held up. Figure 6 illustrates a sequence in which the processor
may run different processes P1, P2, P3, P4. On the right hand side of Figure 6 is an illustration
of where these processes expect their data to be held in memory. Thus, the data for the process P1 are held on
page 0. The process P2 has its data held on pages 1 and 2. Data for processes P3 and P4 share page 3. In the
example, the processor executes a first sequence of process P1, a first sequence of process P2, a second sequence of
process P1, a second sequence of process P2 and then a first sequence of process P3. When the second sequence
of the process P1 has been executed, the process P1 has been fully run by the processor. It will readily be apparent
that in a conventional cache system, once the processor has started executing the first sequence of the process P2
and is thus requesting accesses from page 1, the data items and instructions in these lines will replace in the cache
the previously stored data items and instructions from page 0. However, these may soon again be required when the
second sequence of process P1 is executed. The cache partitioning mechanism described herein avoids the timing
delays and uncertainties which can result from this. Figure 7 illustrates the partitioning of the cache while the
processor is running process P1, and the change in partitioning when the processor switches to running process P3.
On its left hand side, Figure 7 shows the cache partitioned while the processor is running processes
P1 and P2. The process P1 may use banks B1 and B2 of the cache, but may not use banks B3 and B4. Conversely,
the process P2 may use banks B3 and B4, but not banks B1 and B2. This can be seen in the TLB entries. That is,
process P1 has a cache partition indicator allowing it to access banks B1 and B2, but not B3 and B4. Process P2 has
cache partition indicators allowing it to access banks B3 and B4 but not B1 and B2. Process P3 has a cache partition
indicator which prevents it from accessing the cache. Thus, any attempt by the processor to load items from the
process P3 into the cache would be prohibited. This is not a disadvantage for the described sequence, because the
processor is not intending to execute any part of the process P3 until it has finished
executing process P1. If it did for some reason have to execute P3, the only downside would be that it would have to
make its accesses directly from main memory and would not be allowed use of the cache.

When the process P1 has finished executing, the processor can request kernel mode to allow it to alter the cache
partition indicators for the processes. The manner in which this is done depends on how the partitioning mechanism
is implemented. In the above described example, the partition code can be set in the TLB entries.
Thus, the partition codes are normally set by kernel mode software running on the CPU 2. However, a user may alter
partitions by requesting that the cache partitions be altered. In that event, the CPU 2 would enter kernel mode
to implement the request, change the TLB entries accordingly and then return to the user mode to allow the user to
continue. Thus, a user can alter the partitioning behaviour of the cache, thus providing much greater flexibility than
has hitherto been possible. The change is illustrated on the right hand side of Figure 6. Thus, now the cache partition
indicators prevent the process P1 from using the cache at all, but allocate banks B1 and B2 to the processes P3 and
P4, by altering the cache partition indicator for processes P3 and P4 so that they can access these banks of the cache.
Thus, when the processor is expecting to execute the process P3, it now has a cache facility.
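The two allocations described above, while process P1 is running and after it has finished, can be written down and checked for overlap. The bit ordering (bit 0 for bank B1 up to bit 3 for bank B4) is an assumption, as is leaving the indicator of P2 unchanged after the switch, which the text does not address; the helper names are hypothetical.

```python
ALL_BANKS = 0b1111


def allowed_mask(partition_indicator: int) -> int:
    """Bit j of the result is set when bank j may be used (indicator bit j is 0)."""
    return ~partition_indicator & ALL_BANKS


def share_a_bank(pi_a: int, pi_b: int) -> bool:
    """True when two partition indicators permit at least one common bank."""
    return bool(allowed_mask(pi_a) & allowed_mask(pi_b))


# Assumed ordering: bit 0 <-> B1, bit 1 <-> B2, bit 2 <-> B3, bit 3 <-> B4.
while_p1_runs = {"P1": 0b1100, "P2": 0b0011, "P3": 0b1111}
after_p1_done = {"P1": 0b1111, "P2": 0b0011, "P3": 0b1100, "P4": 0b1100}

print(share_a_bank(while_p1_runs["P1"], while_p1_runs["P2"]))   # disjoint partitions
print(bin(allowed_mask(after_p1_done["P3"])))                   # banks now open to P3
```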

The system described herein prevents concurrent processes from evicting each other's data
from the data cache. That is, the processes are mapped to disjoint data cache partitions. This effectively gives each
process its own private cache, which makes their performance much easier to predict accurately. The result of the
system described herein is illustrated in Figure 9.

Another situation in which the system described herein is particularly useful is in the implementation of performance-
critical routines. Often, there are a few routines whose performance is absolutely critical to the overall performance of
the system. A good example of this might be an interrupt service routine which, when called, must produce an effect
within a guaranteed (and usually short) length of time. In these cases, cache partitions may be reserved in the data
and instruction caches for the data and code required for these important routines. The rest of the instruction and data
caches can then be shared out among the remaining processes. Figure 10 illustrates such a partitioning. In
Figure 10 an example is shown of reserving one 2 kilobyte partition in the data cache and one 4 kilobyte partition in
the instruction cache for a performance critical interrupt service routine.

It will be appreciated that the present invention is not restricted to the specifically described embodiment [...]
is provided in association with the particular process being executed.

In the embodiment described above, a single cache access circuit 20 is shown for accessing the cache both on
look-up and refill. However, it is also possible to provide the cache with an additional access port for refill, so that
look-up and refill take place via different access ports for the cache memory 22.

In the described embodiment, the refill engine 30 and cache access circuit 20 are shown in individual blocks.
However, it would be quite possible to combine their functions into a single cache access circuit which performs both
look-up and refill.

The following are possible alternatives for generation of the partition indicator in association with a particular
process.

In one alternative, the partition indicator may be placed directly in the thread status word TSW. For the described
thread status word, this could be done by allocating the currently reserved bits 56 to 59 of the TSW to a new field
TSW.PI. The value of TSW.PI would then be passed to the refill engine 30 directly from the CPU. This would require a
modification to the architecture illustrated in Figure 1 to connect the partition indicator bus 36 directly from the CPU 2
to the refill engine 30, rather than from the TLB 10. For this implementation, the partition indicator PI is changed when
a new thread status word TSW is loaded for the next thread to be executed. This can be done by a particular setting
instruction which sets the parameters of the thread status word.
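As a rough illustration of this alternative, the TSW.PI field can be modelled as a 4-bit field occupying bits 56 to 59 of the thread status word. The sketch below assumes a 64-bit TSW with bit 0 as the least significant bit; neither the word width nor the bit-numbering convention is stated in this passage, and the function names are hypothetical.

```python
TSW_WIDTH = 64                      # assumed width of the thread status word
TSW_MASK = (1 << TSW_WIDTH) - 1
PI_SHIFT = 56                       # TSW.PI occupies bits 56 to 59
PI_FIELD = 0xF << PI_SHIFT


def tsw_get_pi(tsw):
    """Extract the 4-bit partition indicator from a thread status word."""
    return (tsw & PI_FIELD) >> PI_SHIFT


def tsw_set_pi(tsw, pi):
    """Return a copy of tsw with TSW.PI replaced, other bits unchanged."""
    if not 0 <= pi <= 0xF:
        raise ValueError("partition indicator must fit in 4 bits")
    return (tsw & TSW_MASK & ~PI_FIELD) | (pi << PI_SHIFT)
```

Loading a new TSW for the next thread then implicitly changes the value returned by `tsw_get_pi`, which in this model stands for the value passed to the refill engine.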

In another implementation, a table can be provided in the translation look-aside buffer 10 which maps the group
number to the partition indicator PI without using the virtual address. This could be done by having a table indexed by
group number which returns the partition indicator, or a table which has a group number/partition indicator pair which
returns the partition indicator for the matching group. In this case, the architecture of Figure 1 would be unaltered, but
a different table would be required in the translation look-aside buffer 10. In this implementation, the partition indicator
could be changed using a "put" instruction which has two operands, the control register number to be changed and its
new value. All control registers would be allocated a number which can be used to access them using this instruction, and
so each entry in the group number/partition indicator table would have a unique control register number.
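A minimal software model of the first variant (a table indexed by group number) might look like the following. The class name, the number of groups, and the scheme of assigning consecutive control register numbers starting from a base value are assumptions made for illustration; the text only requires that each table entry has a unique control register number.

```python
class GroupPartitionTable:
    """Maps a group number to a partition indicator (hypothetical model)."""

    def __init__(self, num_groups, first_ctrl_reg):
        # One partition indicator per group, initially selecting no partition.
        self._pi = [0] * num_groups
        # Assumed numbering: entry g is control register first_ctrl_reg + g.
        self._first_ctrl_reg = first_ctrl_reg

    def put(self, ctrl_reg, value):
        """Model of the two-operand "put" instruction: register number, new value."""
        group = ctrl_reg - self._first_ctrl_reg
        if not 0 <= group < len(self._pi):
            raise ValueError("control register is not a table entry")
        self._pi[group] = value

    def lookup(self, group):
        """Return the partition indicator for a group, ignoring the virtual address."""
        return self._pi[group]


# Usage: four groups whose entries are control registers 0x20 to 0x23.
table = GroupPartitionTable(num_groups=4, first_ctrl_reg=0x20)
table.put(0x22, 0b0011)   # group 2 may refill partitions 0 and 1
```

The second variant in the text, a list of group number/partition indicator pairs searched for a matching group, would differ only in how `lookup` finds the entry.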


Claims 

1. A method of operating a cache memory arranged between a processor and a main memory of a computer, the
processor being capable of executing a plurality of processes wherein each process includes a sequence of
instructions, the method comprising:

dividing the cache memory into cache partitions, each cache partition having a plurality of addressable storage
locations for holding items in the cache memory;

allocating to each process a partition indicator identifying which, if any, of said cache partitions is to be used
for holding items for use in the execution of that process; and

when the processor requests an item from main memory during execution of said current process and that
item is not held in the cache memory, fetching the item from main memory and loading it into one of the plurality
of addressable storage locations in the identified cache partition.


2. A method according to claim 1, comprising the step of holding in a store the partition indicator for a current process
which is currently being executed.

3. A method according to claim 2, wherein when a new process is to be executed by the processor, a new partition
indicator allocated to that new process is loaded into said store.

4. A method according to claim 2 or 3, wherein the store which holds the partition indicator for the current process 
is a process status store which also holds status information about the process. 

5. A method according to claim 1, 2 or 3, wherein the partition indicator is included in a group identifier for the process,
the group identifier identifying an address space for the process.

6. A method according to claim 5, wherein the processor issues addresses comprising a virtual page number and a
line-in page number and wherein a translation look-aside buffer is provided for translating the virtual page number
to a real page number for accessing the main memory, the translation look-aside buffer also receiving the group
identifier and deriving therefrom the partition indicator for the current process.

7. A method according to any preceding claim, wherein the number of addressable storage locations in each cache
partition is alterable.

8. A computer system comprising: 

a processor for executing a plurality of processes wherein each process includes a sequence of instructions,
the processor including a process status store which holds a partition indicator for a current process which is
currently being executed;
a main memory; 

a cache memory having a set of cache partitions, each cache partition comprising a plurality of addressable 
storage locations for holding items fetched from said main memory for use by the processor in execution of 
its processes; and 

a cache refill mechanism arranged to fetch an item from the main memory and to load said item into the cache
memory at one of said addressable storage locations, wherein the cache refill mechanism selects said one of
said addressable storage locations for loading said items in dependence on the partition indicator held in the
process status store in association with the current process.

9. A computer system according to claim 8, wherein the partition indicator is included in a group identifier for each 
process which identifies an address space for the process. 

10. A computer system according to claim 9, wherein the processor issues addresses comprising a virtual page number
and a line-in page number and wherein the system comprises a translation look-aside buffer for translating the
virtual page number to a real page number for accessing the main memory, the translation look-aside buffer being
operable to receive said group identifier and to derive therefrom the partition indicator for the current process.
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